{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "f58Xwgoziq5R"
   },
   "source": [
    "![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)\n",
    "\n",
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/sequence2sequence/NLU_Medical_TextGenerators.ipynb)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "bxhiePZyoqC5"
   },
   "source": [
    "# **Medical Text Generator**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "U78yrnQdiq5S"
   },
   "source": [
    "MedicalTextGenerator uses the basic BioGPT model to perform tasks related to medical text abstraction. Given a prompt and a context, the annotator can be instructed to perform a specific task, such as explaining why a patient may have a particular disease or paraphrasing the context more directly. It can also create a clinical note for a cancer patient from given keywords, or write medical text that continues from introductory sentences. The BioGPT model is trained on large volumes of medical data, which allows it to identify and extract the most relevant information from the text provided."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8jdcy5Whiq5S"
   },
   "source": [
    "**Available models:**\n",
    "\n",
    "| Language | nlp.load() reference                                    | Spark NLP Model reference                                                                                                         |\n",
    "|----------|---------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------|\n",
    "| English  | en.generate.biomedical_biogpt_base | [text_generator_biomedical_biogpt_base](https://nlp.johnsnowlabs.com//2023/04/03/text_generator_biomedical_biogpt_base_en.html) |\n",
    "| English  | en.generate.generic_flan_base      | [text_generator_generic_flan_base](https://nlp.johnsnowlabs.com//2023/04/03/text_generator_generic_flan_base_en.html)           |\n",
    "| English  | en.generate.generic_jsl_base       | [text_generator_generic_jsl_base](https://nlp.johnsnowlabs.com//2023/04/03/text_generator_generic_jsl_base_en.html)             |\n",
    "| English  | en.generate.generic_flan_t5_large  | [text_generator_generic_flan_t5_large](https://nlp.johnsnowlabs.com//2023/04/04/text_generator_generic_flan_t5_large_en.html)   |\n",
    "| English  | en.generate.biogpt_chat_jsl                       | [biogpt_chat_jsl](https://nlp.johnsnowlabs.com//2023/04/12/biogpt_chat_jsl_en.html)                                             |\n",
    "| English  | en.generate.biogpt_chat_jsl_conversational        | [biogpt_chat_jsl_conversational](https://nlp.johnsnowlabs.com//2023/04/18/biogpt_chat_jsl_conversational_en.html)               |\n",
    "| English  | en.generate.biogpt_chat_jsl_conditions            | [biogpt_chat_jsl_conditions](https://nlp.johnsnowlabs.com//2023/05/11/biogpt_chat_jsl_conditions_en.html)                       |\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:17:03.816199Z",
     "iopub.status.busy": "2023-10-22T19:17:03.815800Z",
     "iopub.status.idle": "2023-10-22T19:17:19.577397Z",
     "shell.execute_reply": "2023-10-22T19:17:19.576750Z",
     "shell.execute_reply.started": "2023-10-22T19:17:03.816175Z"
    },
    "id": "e5uMpzwBrsf4",
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Install the johnsnowlabs library\n",
    "! pip install -q johnsnowlabs==5.1.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "8Nk2yIrtsP8N"
   },
   "outputs": [],
   "source": [
    "from google.colab import files\n",
    "print('Please Upload your John Snow Labs License using the button below')\n",
    "license_keys = files.upload()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:19:16.454838Z",
     "iopub.status.busy": "2023-10-22T19:19:16.454276Z",
     "iopub.status.idle": "2023-10-22T19:20:23.964600Z",
     "shell.execute_reply": "2023-10-22T19:20:23.964072Z",
     "shell.execute_reply.started": "2023-10-22T19:19:16.454819Z"
    },
    "id": "SNoZyC59dqbV",
    "tags": []
   },
   "outputs": [],
   "source": [
    "from johnsnowlabs import nlp\n",
    "\n",
    "# After uploading your license, run this to install all licensed Python wheels and pre-download the JARs for the Spark session JVM\n",
    "nlp.install()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:29:43.244567Z",
     "iopub.status.busy": "2023-10-22T19:29:43.244086Z",
     "iopub.status.idle": "2023-10-22T19:29:43.284456Z",
     "shell.execute_reply": "2023-10-22T19:29:43.284001Z",
     "shell.execute_reply.started": "2023-10-22T19:29:43.244544Z"
    },
    "id": "RcbDKFdjSIzP",
    "tags": []
   },
   "outputs": [],
   "source": [
    "from johnsnowlabs import nlp\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "\n",
    "spark = nlp.start()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "V1wGsA5Eiq5U"
   },
   "source": [
    "## 📑  **en.generate.biomedical_biogpt_base**\n",
    "\n",
    "This is a BioGPT-based (LLM) text generation model fine-tuned by John Snow Labs on biomedical datasets (PubMed abstracts). Given a few introductory tokens (input of at most 1024 tokens), it can generate human-like, conceptually meaningful text of up to 1024 tokens."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 860
    },
    "execution": {
     "iopub.execute_input": "2023-10-22T19:24:00.788433Z",
     "iopub.status.busy": "2023-10-22T19:24:00.788101Z",
     "iopub.status.idle": "2023-10-22T19:24:09.199496Z",
     "shell.execute_reply": "2023-10-22T19:24:09.198991Z",
     "shell.execute_reply.started": "2023-10-22T19:24:00.788414Z"
    },
    "id": "f3E4RBmTs1gc",
    "outputId": "d3b89832-e113-4876-cde6-f676303826cd",
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning::Spark Session already created, some configs may not take.\n",
      "Warning::Spark Session already created, some configs may not take.\n",
      "text_generator_biomedical_biogpt_base download started this may take some time.\n",
      "[OK!]\n",
      "sentence_detector_dl download started this may take some time.\n",
      "Approximate size to download 354.6 KB\n",
      "[OK!]\n",
      "Warning::Spark Session already created, some configs may not take.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                                \r"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Covid 19 is a pandemic that has affected the world &apos;s economy and health. The World Health Organization ( WHO ) has declared the pandemic a global emergency. The'"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
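   ],
   "source": [
    "# Load the model and generate a continuation of the prompt\n",
    "df = nlp.load('en.generate.biomedical_biogpt_base').predict('Covid 19 is')\n",
    "# Show the generated text of the first row\n",
    "df.generated.iloc[0]"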
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "aQedbnMWkkCp"
   },
   "source": [
    "## 📑  **en.generate.biogpt_chat_jsl_conversational**\n",
    "\n",
    "This model is based on BioGPT, fine-tuned on medical conversations from clinical settings, and can answer clinical questions about symptoms, drugs, tests, and diseases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 860
    },
    "execution": {
     "iopub.execute_input": "2023-10-22T19:29:53.601546Z",
     "iopub.status.busy": "2023-10-22T19:29:53.601076Z",
     "iopub.status.idle": "2023-10-22T19:29:56.195958Z",
     "shell.execute_reply": "2023-10-22T19:29:56.195401Z",
     "shell.execute_reply.started": "2023-10-22T19:29:53.601523Z"
    },
    "id": "BR03PDROpZpf",
    "outputId": "a6f83288-621c-4420-89e1-a4d54e35292f",
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning::Spark Session already created, some configs may not take.\n",
      "Warning::Spark Session already created, some configs may not take.\n",
      "biogpt_chat_jsl_conversational download started this may take some time.\n",
      "[OK!]\n"
     ]
    }
   ],
   "source": [
    "model = nlp.load('en.generate.biogpt_chat_jsl_conversational')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:29:56.197280Z",
     "iopub.status.busy": "2023-10-22T19:29:56.196873Z",
     "iopub.status.idle": "2023-10-22T19:29:56.200775Z",
     "shell.execute_reply": "2023-10-22T19:29:56.200349Z",
     "shell.execute_reply.started": "2023-10-22T19:29:56.197261Z"
    },
    "id": "fAi9uvo3peH_",
    "outputId": "4de55d8a-cd15-4388-e38d-b180609c8f05",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "MedicalTextGenerator_9838e26f3846"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Cap generation at 100 new tokens per input\n",
    "model['med_text_generator'].setMaxNewTokens(100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:29:56.938617Z",
     "iopub.status.busy": "2023-10-22T19:29:56.938008Z",
     "iopub.status.idle": "2023-10-22T19:29:56.941234Z",
     "shell.execute_reply": "2023-10-22T19:29:56.940763Z",
     "shell.execute_reply.started": "2023-10-22T19:29:56.938592Z"
    },
    "id": "hWbW7LEUsMiG",
    "tags": []
   },
   "outputs": [],
   "source": [
    "data = ['How to treat asthma ?', 'How to treat common cold?']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:29:57.651858Z",
     "iopub.status.busy": "2023-10-22T19:29:57.651402Z",
     "iopub.status.idle": "2023-10-22T19:30:08.947796Z",
     "shell.execute_reply": "2023-10-22T19:30:08.947300Z",
     "shell.execute_reply.started": "2023-10-22T19:29:57.651836Z"
    },
    "id": "Vf0vbe7UpyZ3",
    "outputId": "050279b8-2e34-466f-c34c-8cfe98dfddbb",
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sentence_detector_dl download started this may take some time.\n",
      "Approximate size to download 354.6 KB\n",
      "[OK!]\n",
      "Warning::Spark Session already created, some configs may not take.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "                                                                                \r"
     ]
    }
   ],
   "source": [
    "df = model.predict(data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:30:08.948989Z",
     "iopub.status.busy": "2023-10-22T19:30:08.948682Z",
     "iopub.status.idle": "2023-10-22T19:30:08.953745Z",
     "shell.execute_reply": "2023-10-22T19:30:08.953325Z",
     "shell.execute_reply.started": "2023-10-22T19:30:08.948971Z"
    },
    "id": "lZoWyd2CnSiE",
    "outputId": "19fe94f1-d101-49e3-983c-fb4bc36b26d7",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>generated</th>\n",
       "      <th>sentence</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>How to treat asthma?. answer: Yes, you can def...</td>\n",
       "      <td>How to treat asthma ?</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>How to treat common cold? I am suffering from ...</td>\n",
       "      <td>How to treat common cold?</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                           generated  \\\n",
       "0  How to treat asthma?. answer: Yes, you can def...   \n",
       "1  How to treat common cold? I am suffering from ...   \n",
       "\n",
       "                    sentence  \n",
       "0      How to treat asthma ?  \n",
       "1  How to treat common cold?  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-22T19:30:08.954421Z",
     "iopub.status.busy": "2023-10-22T19:30:08.954258Z",
     "iopub.status.idle": "2023-10-22T19:30:09.049374Z",
     "shell.execute_reply": "2023-10-22T19:30:09.048942Z",
     "shell.execute_reply.started": "2023-10-22T19:30:08.954386Z"
    },
    "id": "w6ReENPIu995",
    "outputId": "fb02164b-9e6a-471c-e29f-db3f6d3bf398",
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "How to treat asthma?. answer: Yes, you can definitely take Montelukast as a maintenance therapy. Also, you can take an Albuterol inhaler if you have a difficult time breathing. If you have a cough, it will help. If you have no cough, then you can take a combination of montelukast and levocetirizine. If the symptoms are too troublesome, you can try using Budesonide. If the symptoms are too mild, you can also take a combination of Budesonide\n",
      "How to treat common cold? I am suffering from cough from last two weeks. I am taking azithral 200 and levolin 100. I am also taking azithral 500 and levolin 400 twice a day. I am also taking azithral 500 and levolin 400mg twice a day. answer: Common cold is caused due to viral infection and the most common treatment for it is decongestant syrup like oxymetazoline or oxymetazoline nasal spray. It is given for symptomatic relief and to\n"
     ]
    }
   ],
   "source": [
    "# Print the full generated text for each input question\n",
    "for text in df['generated']:\n",
    "    print(text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "CH47DpkXvDwL"
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
